Georgina's Website ❤️

6. Eigenvectors and Diagonalization

6.1 Matrix of a Linear Mapping, Similar Matrices

A linear mapping $L:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ is sometimes called a linear operator. This is often done when we wish to stress the fact that the domain and codomain of the linear mapping are the same. Let $\mathcal{B}=\{\vec{v}_{1},...,\vec{v}_{n}\}$ be a basis for $\mathbb{R}^{n}$ and let $L:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ be a linear operator. The $\mathcal{B}$-matrix of $L$ is defined to be $$[L]_{\mathcal{B}}=[\,[L(\vec{v}_{1})]_{\mathcal{B}}\ \cdots\ [L(\vec{v}_{n})]_{\mathcal{B}}\,].$$ It satisfies $[L(\vec{x})]_{\mathcal{B}}=[L]_{\mathcal{B}}[\vec{x}]_{\mathcal{B}}$.

e.g. Let $L$ be a linear mapping with standard matrix $[L]=\begin{bmatrix}3&-1\\ -1&3\end{bmatrix}$ and $\mathcal{B}=\{\begin{bmatrix}1\\ 1\end{bmatrix},\begin{bmatrix}1\\ -1\end{bmatrix}\}$. Find the $\mathcal{B}$-matrix of $L$ and use it to determine $[L(\vec{x})]_{\mathcal{B}}$ where $[\vec{x}]_{\mathcal{B}}=\begin{bmatrix}1\\ 1\end{bmatrix}$.

We have $[L]_{\mathcal{B}}=[\,[L(\begin{bmatrix}1\\ 1\end{bmatrix})]_{\mathcal{B}}\ [L(\begin{bmatrix}1\\ -1\end{bmatrix})]_{\mathcal{B}}\,]$. First, $L(\begin{bmatrix}1\\ 1\end{bmatrix})=\begin{bmatrix}3&-1\\ -1&3\end{bmatrix}\begin{bmatrix}1\\ 1\end{bmatrix}=\begin{bmatrix}2\\ 2\end{bmatrix}$. To find $[\begin{smallmatrix}2\\2\end{smallmatrix}]_{\mathcal{B}}$, we solve $c_1\begin{bmatrix}1\\1\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}2\\2\end{bmatrix}$, i.e. $c_1+c_2=2$ and $c_1-c_2=2$. Adding the equations gives $2c_1=4\Rightarrow c_1=2$; then $2+c_2=2\Rightarrow c_2=0$. So $[L(\begin{smallmatrix}1\\1\end{smallmatrix})]_{\mathcal{B}}=\begin{bmatrix}2\\0\end{bmatrix}$. Next, $L(\begin{bmatrix}1\\ -1\end{bmatrix})=\begin{bmatrix}3&-1\\ -1&3\end{bmatrix}\begin{bmatrix}1\\ -1\end{bmatrix}=\begin{bmatrix}4\\ -4\end{bmatrix}$. To find $[\begin{smallmatrix}4\\-4\end{smallmatrix}]_{\mathcal{B}}$, we solve $d_1\begin{bmatrix}1\\1\end{bmatrix}+d_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}4\\-4\end{bmatrix}$, i.e. $d_1+d_2=4$ and $d_1-d_2=-4$. Adding gives $2d_1=0\Rightarrow d_1=0$; then $0+d_2=4\Rightarrow d_2=4$. So $[L(\begin{smallmatrix}1\\-1\end{smallmatrix})]_{\mathcal{B}}=\begin{bmatrix}0\\4\end{bmatrix}$.

So, $[L]_{\mathcal{B}}=\begin{bmatrix}2&0\\ 0&4\end{bmatrix}$. Then $[L(\vec{x})]_{\mathcal{B}}=[L]_{\mathcal{B}}[\vec{x}]_{\mathcal{B}}=\begin{bmatrix}2&0\\ 0&4\end{bmatrix}\begin{bmatrix}1\\ 1\end{bmatrix}=\begin{bmatrix}2\\ 4\end{bmatrix}$.
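As a sanity check, the same $\mathcal{B}$-matrix can be obtained from the change-of-coordinates formula $[L]_{\mathcal{B}}=P^{-1}[L]P$, where the columns of $P$ are the basis vectors of $\mathcal{B}$. A minimal numerical sketch, assuming NumPy is available:

```python
import numpy as np

L = np.array([[3.0, -1.0], [-1.0, 3.0]])   # standard matrix [L]
P = np.array([[1.0, 1.0], [1.0, -1.0]])    # columns: the basis vectors of B

L_B = np.linalg.inv(P) @ L @ P             # [L]_B = P^{-1} [L] P
x_B = np.array([1.0, 1.0])                 # [x]_B
print(np.allclose(L_B, np.diag([2.0, 4.0])))   # True
print(np.allclose(L_B @ x_B, [2.0, 4.0]))      # True
```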

e.g. Let $L:\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}$ be the linear mapping defined by $L(x_{1},x_{2})=(2x_{1}+3x_{2},4x_{1}-x_{2})$ and $\mathcal{B}=\{\begin{bmatrix}1\\ 2\end{bmatrix},\begin{bmatrix}1\\ -1\end{bmatrix}\}$. Find the $\mathcal{B}$-matrix of $L$ and use it to determine $[L(\vec{x})]_{\mathcal{B}}$ where $[\vec{x}]_{\mathcal{B}}=\begin{bmatrix}1\\ 1\end{bmatrix}$.

The standard matrix is $[L]=\begin{bmatrix}2&3\\ 4&-1\end{bmatrix}$. First, $L(\begin{bmatrix}1\\ 2\end{bmatrix})=\begin{bmatrix}2&3\\ 4&-1\end{bmatrix}\begin{bmatrix}1\\ 2\end{bmatrix}=\begin{bmatrix}2+6\\ 4-2\end{bmatrix}=\begin{bmatrix}8\\ 2\end{bmatrix}$. To find $[\begin{smallmatrix}8\\2\end{smallmatrix}]_{\mathcal{B}}$, solve $c_1\begin{bmatrix}1\\2\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}8\\2\end{bmatrix}$, i.e. $c_1+c_2=8$ and $2c_1-c_2=2$. Adding gives $3c_1=10\Rightarrow c_1=10/3$; then $10/3+c_2=8\Rightarrow c_2=24/3-10/3=14/3$. So $[L(\begin{smallmatrix}1\\2\end{smallmatrix})]_{\mathcal{B}}=\begin{bmatrix}10/3\\ 14/3\end{bmatrix}$. Next, $L(\begin{bmatrix}1\\ -1\end{bmatrix})=\begin{bmatrix}2&3\\ 4&-1\end{bmatrix}\begin{bmatrix}1\\ -1\end{bmatrix}=\begin{bmatrix}2-3\\ 4+1\end{bmatrix}=\begin{bmatrix}-1\\ 5\end{bmatrix}$. To find $[\begin{smallmatrix}-1\\5\end{smallmatrix}]_{\mathcal{B}}$, solve $d_1\begin{bmatrix}1\\2\end{bmatrix}+d_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}-1\\5\end{bmatrix}$, i.e. $d_1+d_2=-1$ and $2d_1-d_2=5$. Adding gives $3d_1=4\Rightarrow d_1=4/3$; then $4/3+d_2=-1\Rightarrow d_2=-3/3-4/3=-7/3$. So $[L(\begin{smallmatrix}1\\-1\end{smallmatrix})]_{\mathcal{B}}=\begin{bmatrix}4/3\\ -7/3\end{bmatrix}$.

Then $[L]_{\mathcal{B}}=\begin{bmatrix}10/3&4/3\\ 14/3&-7/3\end{bmatrix}$ and $[L(\vec{x})]_{\mathcal{B}}=\begin{bmatrix}10/3&4/3\\ 14/3&-7/3\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}10/3+4/3\\ 14/3-7/3\end{bmatrix}=\begin{bmatrix}14/3\\ 7/3\end{bmatrix}$.

An $n\times n$ matrix $D$ is said to be a diagonal matrix if $d_{ij}=0$ for all $i\ne j$. We denote a diagonal matrix by $\text{diag}(d_{11},d_{22},...,d_{nn})$. The matrix $\begin{bmatrix}2&0\\ 0&4\end{bmatrix}$ above is a diagonal matrix.

Theorem 6.1.7

If $A$ and $B$ are $n\times n$ matrices such that $P^{-1}AP=B$ for some invertible matrix $P$, then

  1. $\text{rank } A=\text{rank } B$.
  2. $\det A=\det B$.
  3. $\text{tr } A=\text{tr } B$, where $\text{tr } A=\sum_{i=1}^{n}a_{ii}$ is called the trace of the matrix.

If $A$ and $B$ are $n\times n$ matrices such that $P^{-1}AP=B$ for some invertible matrix $P$, then $A$ is said to be similar to $B$. If $A$ is similar to $B$, prove that $A^{n}$ is similar to $B^{n}$.
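One possible proof sketch, by telescoping the product (an induction on $n$ works equally well):

```latex
B^{n} = (P^{-1}AP)^{n}
      = \underbrace{(P^{-1}AP)(P^{-1}AP)\cdots(P^{-1}AP)}_{n\ \text{factors}}
      = P^{-1}A(PP^{-1})A(PP^{-1})\cdots A P
      = P^{-1}A^{n}P,
```

so $A^{n}$ is similar to $B^{n}$, via the same invertible matrix $P$.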

6.2 Eigenvalues and Eigenvectors

Let $A$ be an $n\times n$ matrix. If there exists a vector $\vec{v}\ne\vec{0}$ such that $A\vec{v}=\lambda\vec{v}$, then the scalar $\lambda$ is called an eigenvalue of $A$ and $\vec{v}$ is called an eigenvector of $A$ corresponding to $\lambda$. The pair $(\lambda,\vec{v})$ is called an eigenpair.

Let $L:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ be a linear operator. If there exists a vector $\vec{v}\ne\vec{0}$ such that $L(\vec{v})=\lambda\vec{v}$, then $\lambda$ is called an eigenvalue of $L$ and $\vec{v}$ is called an eigenvector of $L$ corresponding to $\lambda$. Why do you think we have the $\vec{v}\ne\vec{0}$ condition? (If $\vec{v}=\vec{0}$, then $A\vec{0}=\lambda\vec{0}$ reduces to $\vec{0}=\vec{0}$ for every $\lambda$, so every scalar would trivially qualify as an eigenvalue.)

e.g. Consider again the linear mapping $L:\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}$ with standard matrix $[L]=\begin{bmatrix}3&-1\\ -1&3\end{bmatrix}$. As we saw: $$L(1,1)=[L]\begin{bmatrix}1\\ 1\end{bmatrix}=\begin{bmatrix}3&-1\\ -1&3\end{bmatrix}\begin{bmatrix}1\\ 1\end{bmatrix}=\begin{bmatrix}2\\ 2\end{bmatrix}=2\begin{bmatrix}1\\ 1\end{bmatrix}$$ $$L(1,-1)=[L]\begin{bmatrix}1\\ -1\end{bmatrix}=\begin{bmatrix}3&-1\\ -1&3\end{bmatrix}\begin{bmatrix}1\\ -1\end{bmatrix}=\begin{bmatrix}4\\ -4\end{bmatrix}=4\begin{bmatrix}1\\ -1\end{bmatrix}$$ Thus, $(2,\begin{bmatrix}1\\ 1\end{bmatrix})$ and $(4,\begin{bmatrix}1\\ -1\end{bmatrix})$ are eigenpairs of $[L]$. Equivalently, $2$ is an eigenvalue of $L$ with eigenvector $\begin{bmatrix}1\\ 1\end{bmatrix}$, and $4$ is another eigenvalue with corresponding eigenvector $\begin{bmatrix}1\\ -1\end{bmatrix}$.

e.g. Determine which of the following vectors are eigenvectors of $A=\begin{bmatrix}2&1&1\\ -3&-2&-3\\ -1&-1&0\end{bmatrix}$.

(a) $\vec{v}_{1}=\begin{bmatrix}1\\ 1\\ 1\end{bmatrix}$. $A\vec{v}_{1}=\begin{bmatrix}2+1+1\\ -3-2-3\\ -1-1+0\end{bmatrix}=\begin{bmatrix}4\\ -8\\ -2\end{bmatrix}$, which is not of the form $\lambda\vec{v}_1$. No.

(b) $\vec{v}_{2}=\begin{bmatrix}-1\\ 3\\ 1\end{bmatrix}$. $A\vec{v}_{2}=\begin{bmatrix}2(-1)+1(3)+1(1)\\ -3(-1)+(-2)(3)+(-3)(1)\\ -1(-1)+(-1)(3)+0(1)\end{bmatrix}=\begin{bmatrix}-2+3+1\\ 3-6-3\\ 1-3+0\end{bmatrix}=\begin{bmatrix}2\\ -6\\ -2\end{bmatrix}=(-2)\vec{v}_2$. So $\vec{v}_2$ is an eigenvector with eigenvalue $\lambda=-2$. Yes.

(c) $\vec{v}_{3}=\begin{bmatrix}1\\ 0\\ -1\end{bmatrix}$. $A\vec{v}_{3}=\begin{bmatrix}2(1)+1(0)+1(-1)\\ -3(1)+(-2)(0)+(-3)(-1)\\ -1(1)+(-1)(0)+0(-1)\end{bmatrix}=\begin{bmatrix}2-1\\ -3+3\\ -1+0\end{bmatrix}=\begin{bmatrix}1\\ 0\\ -1\end{bmatrix}=(1)\vec{v}_3$. So $\vec{v}_3$ is an eigenvector with eigenvalue $\lambda=1$. Yes.

(d) If $(\lambda,\vec{v})$ is an eigenpair of $A$, is $(2\lambda,2\vec{v})$ another eigenpair of $A$? We have $A(2\vec{v})=2(A\vec{v})=2(\lambda\vec{v})=\lambda(2\vec{v})$, so $2\vec{v}$ is an eigenvector with eigenvalue $\lambda$, not $2\lambda$. For $(2\lambda,2\vec{v})$ to be an eigenpair we would need $A(2\vec{v})=(2\lambda)(2\vec{v})=4\lambda\vec{v}$, i.e. $2\lambda\vec{v}=4\lambda\vec{v}$, which forces $2\lambda\vec{v}=\vec{0}$. Since $\vec{v}\ne\vec{0}$, this holds only when $\lambda=0$. So no; $(2\lambda,2\vec{v})$ is an eigenpair only in the special case $\lambda=0$.

e.g. Can you imagine a scenario where $(\lambda,\vec{v}_{1})$ and $(\lambda,\vec{v}_{2})$ are eigenpairs of $A$ with $\vec{v}_{1}\ne\vec{v}_{2}$? Yes. For example, $A=\begin{bmatrix}3&0\\ 0&3\end{bmatrix}$ has $\lambda=3$ with eigenvectors $\vec{v}_{1}=\begin{bmatrix}1\\ 0\end{bmatrix}$ and $\vec{v}_{2}=\begin{bmatrix}0\\ 1\end{bmatrix}$.

From the definition, an eigenpair $(\lambda,\vec{v})$ of $A$ requires $\vec{v}\ne\vec{0}$ with $A\vec{v}=\lambda\vec{v}$. Rearranging, $$A\vec{v}-\lambda\vec{v}=\vec{0}\quad\Rightarrow\quad A\vec{v}-\lambda I\vec{v}=\vec{0}\quad\Rightarrow\quad (A-\lambda I)\vec{v}=\vec{0},$$ so $\vec{v}$ is a solution to the homogeneous system $[A-\lambda I\mid\vec{0}]$. Since we need $\vec{v}\ne\vec{0}$, the eigenvalue $\lambda$ exists if and only if this system has infinitely many solutions (i.e., non-trivial solutions). In turn, this means $A-\lambda I$ is not invertible, or, $\det(A-\lambda I)=0$. Once such a $\lambda$ has been found, we can determine its associated eigenvectors by solving the homogeneous system $(A-\lambda I)\vec{v}=\vec{0}$.

Find all the eigenvalues of $A=\begin{bmatrix}0&1\\ 1&0\end{bmatrix}$ and determine all eigenvectors associated with each eigenvalue. $\det(A-\lambda I)=\begin{vmatrix}-\lambda&1\\ 1&-\lambda\end{vmatrix}=(-\lambda)(-\lambda)-(1)(1)=\lambda^{2}-1$. Setting $\lambda^{2}-1=0$ gives $(\lambda-1)(\lambda+1)=0$, so $\lambda_{1}=1$ and $\lambda_{2}=-1$.

For $\lambda_{1}=1$: $(A-I)\vec{v}=\vec{0}$ with $A-I=\begin{bmatrix}-1&1\\ 1&-1\end{bmatrix}$, whose RREF is $\begin{bmatrix}1&-1\\ 0&0\end{bmatrix}$. So $x_1-x_2=0\Rightarrow x_1=x_2$. Let $x_2=s$. Then $\vec{v}=s\begin{bmatrix}1\\ 1\end{bmatrix}$; the eigenvectors are $s\begin{bmatrix}1\\1\end{bmatrix}$ for $s\ne 0$.

For $\lambda_{2}=-1$: $(A-(-1)I)\vec{v}=(A+I)\vec{v}=\vec{0}$ with $A+I=\begin{bmatrix}1&1\\ 1&1\end{bmatrix}$, whose RREF is $\begin{bmatrix}1&1\\ 0&0\end{bmatrix}$. So $x_1+x_2=0\Rightarrow x_1=-x_2$. Let $x_2=s$. Then $\vec{v}=s\begin{bmatrix}-1\\ 1\end{bmatrix}$; the eigenvectors are $s\begin{bmatrix}-1\\1\end{bmatrix}$ for $s\ne 0$.
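These eigenpairs can be confirmed numerically. A sketch assuming NumPy; note that `np.linalg.eig` returns unit-length eigenvectors as columns, in no guaranteed order:

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
evals, evecs = np.linalg.eig(A)
print(sorted(evals))    # -1 and 1

# every returned column satisfies A v = lambda v:
for lam, v in zip(evals, evecs.T):
    print(np.allclose(A @ v, lam * v))    # True each time
```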

Let $A$ be an $n\times n$ matrix. The characteristic polynomial of $A$ is the $n$-th degree polynomial $C_{A}(\lambda)=\det(A-\lambda I)$. If there is no risk of confusion, we will sometimes write $C(\lambda)$ instead of $C_{A}(\lambda)$.

Theorem 6.2.8

A scalar $\lambda$ is an eigenvalue of an $n\times n$ matrix $A$ if and only if $C_{A}(\lambda)=0$.

Let $A$ be an $n\times n$ matrix with eigenvalue $\lambda$. We call the nullspace of $A-\lambda I$ the eigenspace of $A$ corresponding to $\lambda$. The eigenspace is denoted $E_{\lambda}$.

e.g. Find all eigenvalues and a basis for each eigenspace for $A=\begin{bmatrix}1&2&1\\ 0&1&2\\ 0&0&-2\end{bmatrix}$. $C_{A}(\lambda)=\det(A-\lambda I)=\begin{vmatrix}1-\lambda&2&1\\ 0&1-\lambda&2\\ 0&0&-2-\lambda\end{vmatrix}$. Since $A-\lambda I$ is upper triangular, the determinant is the product of the diagonal entries: $C_{A}(\lambda)=(1-\lambda)(1-\lambda)(-2-\lambda)=(1-\lambda)^{2}(-2-\lambda)$. Setting $C_{A}(\lambda)=0$ gives the eigenvalues $\lambda_{1}=1$ (with algebraic multiplicity 2) and $\lambda_{2}=-2$ (with algebraic multiplicity 1).

For $\lambda_{1}=1$: $(A-I)\vec{v}=\vec{0}$ with $A-I=\begin{bmatrix}0&2&1\\ 0&0&2\\ 0&0&-3\end{bmatrix}$. Row reducing, $$\begin{bmatrix}0&2&1\\ 0&0&2\\ 0&0&-3\end{bmatrix}\xrightarrow{R_2/2}\begin{bmatrix}0&2&1\\ 0&0&1\\ 0&0&-3\end{bmatrix}\xrightarrow{R_1-R_2,\ R_3+3R_2}\begin{bmatrix}0&2&0\\ 0&0&1\\ 0&0&0\end{bmatrix}\xrightarrow{R_1/2}\begin{bmatrix}0&1&0\\ 0&0&1\\ 0&0&0\end{bmatrix}.$$ So $x_2=0$, $x_3=0$, and $x_1$ is free. Let $x_1=t$; the eigenvectors are $\vec{v}=t\begin{bmatrix}1\\0\\0\end{bmatrix}$, and a basis for $E_{\lambda_1}$ is $\{\begin{bmatrix}1\\0\\0\end{bmatrix}\}$.

For $\lambda_{2}=-2$: $(A-(-2)I)\vec{v}=(A+2I)\vec{v}=\vec{0}$ with $A+2I=\begin{bmatrix}3&2&1\\ 0&3&2\\ 0&0&0\end{bmatrix}$. Row reducing, $$\begin{bmatrix}3&2&1\\ 0&3&2\\ 0&0&0\end{bmatrix}\xrightarrow{R_2/3}\begin{bmatrix}3&2&1\\ 0&1&2/3\\ 0&0&0\end{bmatrix}\xrightarrow{R_1-2R_2}\begin{bmatrix}3&0&-1/3\\ 0&1&2/3\\ 0&0&0\end{bmatrix}\xrightarrow{R_1/3}\begin{bmatrix}1&0&-1/9\\ 0&1&2/3\\ 0&0&0\end{bmatrix}.$$ So $x_1-\frac{1}{9}x_3=0\Rightarrow x_1=\frac{1}{9}x_3$ and $x_2+\frac{2}{3}x_3=0\Rightarrow x_2=-\frac{2}{3}x_3$. Let $x_3=t$; the eigenvectors are $\vec{v}=t\begin{bmatrix}1/9\\ -2/3\\ 1\end{bmatrix}$, and a basis for $E_{\lambda_2}$ is $\{\begin{bmatrix}1/9\\ -2/3\\ 1\end{bmatrix}\}$.
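A quick numerical check of both eigenpairs (a sketch assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0],
              [0.0, 0.0, -2.0]])
v1 = np.array([1.0, 0.0, 0.0])        # basis vector of E_1
v2 = np.array([1/9, -2/3, 1.0])       # basis vector of E_{-2}
print(np.allclose(A @ v1, 1 * v1))    # True
print(np.allclose(A @ v2, -2 * v2))   # True
```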

Theorem 6.2.13

If $A$ is an $n\times n$ upper or lower triangular matrix, then the eigenvalues of $A$ are the diagonal entries of $A$.

Let $A$ be an $n\times n$ matrix with eigenvalue $\lambda_{1}$. The algebraic multiplicity of $\lambda_{1}$, denoted $a_{\lambda_{1}}$, is the number of times that $\lambda_{1}$ is a root of the characteristic polynomial $C(\lambda)$. That is, if $C(\lambda)=(\lambda-\lambda_{1})^{k}C_{1}(\lambda)$, where $C_{1}(\lambda_{1})\ne0$, then $a_{\lambda_{1}}=k$. The geometric multiplicity of $\lambda_1$, denoted $g_{\lambda_{1}}$, is the dimension of its eigenspace. So, $g_{\lambda_{1}}=\dim(E_{\lambda_{1}})$.

e.g. For the upper triangular matrix $A=\begin{bmatrix}1&2&1\\ 0&1&2\\ 0&0&-2\end{bmatrix}$, we have $\lambda_{1}=1$ and $\lambda_{2}=-2$, with algebraic multiplicities $a_{\lambda_{1}}=2$ and $a_{\lambda_{2}}=1$. Since $E_{\lambda_{1}}=\text{Span}\{\begin{bmatrix}1\\ 0\\ 0\end{bmatrix}\}$ and $E_{\lambda_{2}}=\text{Span}\{\begin{bmatrix}1/9\\ -2/3\\ 1\end{bmatrix}\}$, the geometric multiplicities are $g_{\lambda_{1}}=\dim E_{\lambda_{1}}=1$ and $g_{\lambda_{2}}=\dim E_{\lambda_{2}}=1$.

e.g. Find the geometric and algebraic multiplicity of all eigenvalues of $A=\begin{bmatrix}-1&6&3\\ 1&0&-1\\ -3&6&5\end{bmatrix}$. $C_A(\lambda)=\det(A-\lambda I)=\begin{vmatrix}-1-\lambda&6&3\\ 1&-\lambda&-1\\ -3&6&5-\lambda\end{vmatrix}$. Expanding along the first column,
$$C_A(\lambda)=(-1-\lambda)[-\lambda(5-\lambda)+6]-1[6(5-\lambda)-18]+(-3)[-6-(-3\lambda)]$$
$$=(-1-\lambda)(\lambda^2-5\lambda+6)-(30-6\lambda-18)+(-3)(-6+3\lambda)$$
$$=(-1-\lambda)(\lambda^2-5\lambda+6)-(12-6\lambda)+18-9\lambda$$
$$=(-1-\lambda)(\lambda^2-5\lambda+6)+6-3\lambda$$
$$=-\lambda^3+5\lambda^2-6\lambda-\lambda^2+5\lambda-6+6-3\lambda=-\lambda^3+4\lambda^2-4\lambda=-\lambda(\lambda^2-4\lambda+4)=-\lambda(\lambda-2)^2.$$
So the eigenvalues are $\lambda_1=0$ (algebraic multiplicity $a_{\lambda_1}=1$) and $\lambda_2=2$ (algebraic multiplicity $a_{\lambda_2}=2$).

For $\lambda_1=0$: $(A-0I)\vec{v}=A\vec{v}=\vec{0}$. Row reducing, $\begin{bmatrix}-1&6&3\\ 1&0&-1\\ -3&6&5\end{bmatrix}\xrightarrow{\text{RREF}}\begin{bmatrix}1&0&-1\\ 0&1&1/3\\ 0&0&0\end{bmatrix}$, so $x_1-x_3=0\Rightarrow x_1=x_3$ and $x_2+\frac{1}{3}x_3=0\Rightarrow x_2=-\frac{1}{3}x_3$. Let $x_3=t$; then $\vec{v}=t\begin{bmatrix}1\\-1/3\\1\end{bmatrix}$ and $E_{\lambda_1}=\text{Span}\{\begin{bmatrix}1\\-1/3\\1\end{bmatrix}\}$, so $g_{\lambda_1}=1$.

For $\lambda_2=2$: $(A-2I)\vec{v}=\vec{0}$ with $A-2I=\begin{bmatrix}-1-2&6&3\\ 1&0-2&-1\\ -3&6&5-2\end{bmatrix}=\begin{bmatrix}-3&6&3\\ 1&-2&-1\\ -3&6&3\end{bmatrix}$. Row reducing, $$\begin{bmatrix}-3&6&3\\ 1&-2&-1\\ -3&6&3\end{bmatrix}\xrightarrow{R_1/(-3)}\begin{bmatrix}1&-2&-1\\ 1&-2&-1\\ -3&6&3\end{bmatrix}\xrightarrow{R_2-R_1,\ R_3+3R_1}\begin{bmatrix}1&-2&-1\\ 0&0&0\\ 0&0&0\end{bmatrix}.$$ So $x_1-2x_2-x_3=0\Rightarrow x_1=2x_2+x_3$. Let $x_2=s$, $x_3=t$; then $\vec{v}=\begin{bmatrix}2s+t\\s\\t\end{bmatrix}=s\begin{bmatrix}2\\1\\0\end{bmatrix}+t\begin{bmatrix}1\\0\\1\end{bmatrix}$ and $E_{\lambda_2}=\text{Span}\{\begin{bmatrix}2\\1\\0\end{bmatrix},\begin{bmatrix}1\\0\\1\end{bmatrix}\}$, so $g_{\lambda_2}=2$.
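Since the geometric multiplicity is the nullity of $A-\lambda I$, it equals $n-\text{rank}(A-\lambda I)$; a numerical sketch with NumPy:

```python
import numpy as np

A = np.array([[-1.0, 6.0, 3.0],
              [ 1.0, 0.0, -1.0],
              [-3.0, 6.0, 5.0]])
n = A.shape[0]
# geometric multiplicity g = n - rank(A - lambda*I) for each eigenvalue
g = {lam: n - np.linalg.matrix_rank(A - lam * np.eye(n)) for lam in (0.0, 2.0)}
print(g)    # g = 1 for lambda = 0, g = 2 for lambda = 2
```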

Lemma 6.2.20

If $A$ and $B$ are similar matrices, then $A$ and $B$ have the same characteristic polynomial, and hence the same eigenvalues.

Theorem 6.2.21

If $A$ is an $n\times n$ matrix with eigenvalue $\lambda_{1}$, then $1\le g_{\lambda_{1}}\le a_{\lambda_{1}}$.

6.3 Diagonalization

A matrix $A\in M_{n\times n}(\mathbb{R})$ is said to be diagonalizable if $A$ is similar to a diagonal matrix $D\in M_{n\times n}(\mathbb{R})$. If $P^{-1}AP=D$, then we say that $P$ diagonalizes $A$.

Remark: For now, we will restrict ourselves to diagonalizing real matrices with real eigenvalues. That is, if $A$ has a non-real eigenvalue, then we will say that $A$ is not diagonalizable over $\mathbb{R}$. In Section 6.5, we will look at diagonalizing matrices over the complex numbers.

Theorem 6.3.2

An $n\times n$ matrix $A$ is diagonalizable (over $\mathbb{R}$) if and only if there exists a basis $\{\vec{v}_{1},...,\vec{v}_{n}\}$ for $\mathbb{R}^{n}$ of eigenvectors of $A$.

e.g. Consider the mapping $L:\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}$ that reflects vectors across the line $y=x$. Its standard matrix is $[L]=\begin{bmatrix}0&1\\ 1&0\end{bmatrix}$, as explored in a previous example, where we found the eigenpairs $(\lambda_{1},\vec{v}_{1})$ and $(\lambda_{2},\vec{v}_{2})$ with $\lambda_{1}=1$, $\lambda_{2}=-1$ and $\vec{v}_{1}=\begin{bmatrix}1\\ 1\end{bmatrix}$, $\vec{v}_{2}=\begin{bmatrix}1\\ -1\end{bmatrix}$. Since we have two linearly independent eigenvectors, we can form a basis of eigenvectors $\mathcal{B}=\{\vec{v}_{1},\vec{v}_{2}\}$. What, then, is the $\mathcal{B}$-matrix of $L$? Well, $L(\vec{v}_{1})=\lambda_{1}\vec{v}_{1}$ and $L(\vec{v}_{2})=\lambda_{2}\vec{v}_{2}$, so $[L(\vec{v}_{1})]_{\mathcal{B}}=\begin{bmatrix}\lambda_{1}\\ 0\end{bmatrix}$ and $[L(\vec{v}_{2})]_{\mathcal{B}}=\begin{bmatrix}0\\ \lambda_{2}\end{bmatrix}$, giving $[L]_{\mathcal{B}}=\begin{bmatrix}\lambda_{1}&0\\ 0&\lambda_{2}\end{bmatrix}=\begin{bmatrix}1&0\\0&-1\end{bmatrix}$. Let $\mathcal{S}$ be the standard basis. We have $[L]_{\mathcal{S}}={}_{\mathcal{S}}P_{\mathcal{B}}\,[L]_{\mathcal{B}}\,{}_{\mathcal{B}}P_{\mathcal{S}}$. Here ${}_{\mathcal{S}}P_{\mathcal{B}}=P=[\vec{v}_1\ \vec{v}_2]=\begin{bmatrix}1&1\\1&-1\end{bmatrix}$, and ${}_{\mathcal{B}}P_{\mathcal{S}}=P^{-1}=\frac{1}{1(-1)-1(1)}\begin{bmatrix}-1&-1\\-1&1\end{bmatrix}=\begin{bmatrix}1/2&1/2\\1/2&-1/2\end{bmatrix}$. So $$\begin{bmatrix}0&1\\ 1&0\end{bmatrix}=\begin{bmatrix}1&1\\ 1&-1\end{bmatrix}\begin{bmatrix}1&0\\ 0&-1\end{bmatrix}\begin{bmatrix}1/2&1/2\\ 1/2&-1/2\end{bmatrix},$$ and $[L]$ is diagonalizable.
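In matrix form this is the factorization $A=PDP^{-1}$; a quick numerical check (sketch assuming NumPy):

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])    # reflection across y = x
P = np.array([[1.0, 1.0], [1.0, -1.0]])   # eigenvectors as columns
D = np.diag([1.0, -1.0])

print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True
print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # True
```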

Lemma 6.3.3

If $A$ is an $n\times n$ matrix with eigenpairs $(\lambda_{1},\vec{v}_{1}),(\lambda_{2},\vec{v}_{2}),...,(\lambda_{k},\vec{v}_{k})$ where $\lambda_{i}\ne\lambda_{j}$ for $i\ne j$, then $\{\vec{v}_{1},...,\vec{v}_{k}\}$ is linearly independent.

Theorem 6.3.4

If $A$ is an $n\times n$ matrix with distinct eigenvalues $\lambda_{1},...,\lambda_{k}$ and $\mathcal{B}_{i}=\{\vec{v}_{i,1},...,\vec{v}_{i,g_{\lambda_{i}}}\}$ is a basis for the eigenspace of $\lambda_{i}$ for $1\le i\le k$, then $\mathcal{B}_{1}\cup\mathcal{B}_{2}\cup...\cup\mathcal{B}_{k}$ is a linearly independent set.

Diagonalizability Test

If $A$ is an $n\times n$ matrix whose characteristic polynomial factors as $C_{A}(\lambda)=(\lambda-\lambda_{1})^{a_{\lambda_{1}}}\cdots(\lambda-\lambda_{k})^{a_{\lambda_{k}}}$ where $\lambda_{1},...,\lambda_{k}$ are the distinct eigenvalues of $A$, then $A$ is diagonalizable if and only if $g_{\lambda_{i}}=a_{\lambda_{i}}$ for $1\le i\le k$.

Corollary 6.3.6

If $A$ is an $n\times n$ matrix with $n$ distinct eigenvalues, then $A$ is diagonalizable.

Algorithm. To diagonalize an $n\times n$ matrix $A$, or show that $A$ is not diagonalizable:

  1. Find and factor the characteristic polynomial $C(\lambda)=\det(A-\lambda I)$.
  2. Let $\lambda_{1},...,\lambda_{n}$ denote the $n$ roots of $C(\lambda)$ (repeated according to multiplicity).
  3. If any of the eigenvalues $\lambda_{i}$ are not real, then $A$ is not diagonalizable over $\mathbb{R}$.
  4. Find a basis for the eigenspace of each distinct eigenvalue $\lambda_{j}$ by finding a basis for the nullspace of $A-\lambda_{j}I$.
  5. If $g_{\lambda_{j}}<a_{\lambda_{j}}$ for any $\lambda_{j}$, then $A$ is not diagonalizable.
  6. Otherwise, form a basis $\{\vec{v}_{1},...,\vec{v}_{n}\}$ for $\mathbb{R}^{n}$ of eigenvectors of $A$ by using Theorem 6.3.4. Let $P=[\vec{v}_{1}\ \cdots\ \vec{v}_{n}]$.
  7. Then $P^{-1}AP=\text{diag}(\lambda_{1},...,\lambda_{n})$, where $\lambda_{i}$ is the eigenvalue corresponding to the eigenvector $\vec{v}_{i}$ for $1\le i\le n$.
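The steps above can be sketched numerically. This is a rough floating-point sketch assuming NumPy (eigenvalues grouped with a tolerance, nullspace via SVD; the helper name `diagonalize` is ours, not a library function), not a robust implementation:

```python
import numpy as np

def diagonalize(A, tol=1e-8):
    """Return (P, D) with P^{-1} A P = D, or None if A is not
    diagonalizable over R. A numerical sketch of the algorithm."""
    evals = np.linalg.eigvals(A)
    if np.max(np.abs(np.imag(evals))) > tol:
        return None                          # step 3: non-real eigenvalue
    evals = np.real(evals)
    n = A.shape[0]
    cols, diag = [], []
    for lam in np.unique(np.round(evals, 6)):
        a = int(np.sum(np.abs(evals - lam) < 1e-6))   # algebraic multiplicity
        # step 4: nullspace of A - lam*I from the SVD
        _, s, Vt = np.linalg.svd(A - lam * np.eye(n))
        g = int(np.sum(s < tol))                      # geometric multiplicity
        if g < a:
            return None                      # step 5: defective eigenvalue
        cols.extend(Vt[n - g:])              # rows of Vt spanning the nullspace
        diag.extend([lam] * g)
    return np.column_stack(cols), np.diag(diag)      # steps 6-7

A = np.array([[-1.0, 6.0, 3.0],
              [ 1.0, 0.0, -1.0],
              [-3.0, 6.0, 5.0]])
P, D = diagonalize(A)
print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # True
```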

e.g. Show that $A=\begin{bmatrix}-1&6&3\\ 1&0&-1\\ -3&6&5\end{bmatrix}$ is diagonalizable and find an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP=D$. From earlier, $C_{A}(\lambda)=-\lambda(\lambda-2)^{2}$, so the eigenvalues are $\lambda_{1}=0$ ($a_{\lambda_{1}}=1$) and $\lambda_{2}=2$ ($a_{\lambda_{2}}=2$). For $\lambda_1=0$, $E_{\lambda_1}=\text{Span}\{\begin{bmatrix}1\\-1/3\\1\end{bmatrix}\}$, so $g_{\lambda_1}=1=a_{\lambda_1}$. For $\lambda_2=2$, $E_{\lambda_2}=\text{Span}\{\begin{bmatrix}2\\1\\0\end{bmatrix},\begin{bmatrix}1\\0\\1\end{bmatrix}\}$, so $g_{\lambda_2}=2=a_{\lambda_2}$. Since the algebraic and geometric multiplicities match for every eigenvalue, $A$ is diagonalizable, with $$P=\begin{bmatrix}1&2&1\\ -1/3&1&0\\ 1&0&1\end{bmatrix},\qquad D=\begin{bmatrix}0&0&0\\ 0&2&0\\ 0&0&2\end{bmatrix}.$$

e.g. Show that $A=\begin{bmatrix}1&1\\ 0&1\end{bmatrix}$ is not diagonalizable. $C_{A}(\lambda)=\begin{vmatrix}1-\lambda&1\\ 0&1-\lambda\end{vmatrix}=(1-\lambda)^{2}=0$ gives the single eigenvalue $\lambda=1$ with $a_{\lambda}=2$. For $\lambda=1$: $(A-I)\vec{v}=\vec{0}$ gives $\begin{bmatrix}0&1\\0&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}$, so $x_2=0$ and $x_1$ is free. Let $x_1=t$; the eigenvectors are $t\begin{bmatrix}1\\0\end{bmatrix}$ and $E_{\lambda}=\text{Span}\{\begin{bmatrix}1\\0\end{bmatrix}\}$, so $g_{\lambda}=1$. Since $g_{\lambda}=1<2=a_{\lambda}$, $A$ is not diagonalizable.

e.g. Show that $A=\begin{bmatrix}0&-1\\ 1&0\end{bmatrix}$ is not diagonalizable over $\mathbb{R}$. $C_{A}(\lambda)=\begin{vmatrix}-\lambda&-1\\ 1&-\lambda\end{vmatrix}=\lambda^{2}+1=0$ gives $\lambda^{2}=-1\Rightarrow\lambda=\pm i$. Since the eigenvalues are not real, $A$ is not diagonalizable over $\mathbb{R}$. Over $\mathbb{C}$, however, it admits the complex diagonalization $P^{-1}AP=\begin{bmatrix}i&0\\0&-i\end{bmatrix}$.

Theorem 6.3.13

If $\lambda_{1},...,\lambda_{n}$ are all the $n$ eigenvalues of an $n\times n$ matrix $A$ (repeated according to algebraic multiplicity), then $\det A=\lambda_{1}\cdots\lambda_{n}$ and $\text{tr } A=\lambda_{1}+\cdots+\lambda_{n}$.

e.g. Find all eigenvalues of $A=\begin{bmatrix}1&0&0\\ 0&0&-1\\ 0&1&0\end{bmatrix}$ and verify that $\det A=\lambda_{1}\lambda_{2}\lambda_{3}$ and $\text{tr } A=\lambda_{1}+\lambda_{2}+\lambda_{3}$. $C_A(\lambda)=\det(A-\lambda I)=\begin{vmatrix}1-\lambda&0&0\\ 0&-\lambda&-1\\ 0&1&-\lambda\end{vmatrix}=(1-\lambda)((-\lambda)(-\lambda)-(-1)(1))=(1-\lambda)(\lambda^2+1)$. The eigenvalues are $\lambda_1=1$, $\lambda_2=i$, $\lambda_3=-i$. $\det A=1(0-(-1))-0+0=1$ and $\lambda_1\lambda_2\lambda_3=1\cdot i\cdot(-i)=-i^2=1$. Verified. $\text{tr } A=1+0+0=1$ and $\lambda_1+\lambda_2+\lambda_3=1+i+(-i)=1$. Verified.
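The same check can be run numerically (sketch assuming NumPy, which returns the complex eigenvalues directly):

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
evals = np.linalg.eigvals(A)              # 1, i, -i
print(np.isclose(np.prod(evals), np.linalg.det(A)))   # True: both equal 1
print(np.isclose(np.sum(evals), np.trace(A)))         # True: both equal 1
```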

6.4 Powers of Matrices

Theorem 6.4.1 Let $A$ be an $n\times n$ matrix. If there exists an invertible matrix $P$ and diagonal matrix $D$ such that $P^{-1}AP=D$, then $A^{k}=PD^{k}P^{-1}$.

e.g. Let $A=\begin{bmatrix}1&2\\ -1&4\end{bmatrix}$. Show that $A^{1000}=PD^{1000}P^{-1}=\begin{bmatrix}2^{1001}-3^{1000}&-2^{1001}+2\cdot 3^{1000}\\ 2^{1000}-3^{1000}&-2^{1000}+2\cdot 3^{1000}\end{bmatrix}$. We first find $P$ and $D$. $C_A(\lambda)=\begin{vmatrix}1-\lambda&2\\-1&4-\lambda\end{vmatrix}=(1-\lambda)(4-\lambda)-(2)(-1)=\lambda^2-5\lambda+6=(\lambda-2)(\lambda-3)$, so the eigenvalues are $\lambda_1=2$ and $\lambda_2=3$. For $\lambda_1=2$: $A-2I=\begin{bmatrix}-1&2\\-1&2\end{bmatrix}\rightarrow\begin{bmatrix}1&-2\\0&0\end{bmatrix}$, so $x_1=2x_2$ and $\vec{v}_1=\begin{bmatrix}2\\1\end{bmatrix}$. For $\lambda_2=3$: $A-3I=\begin{bmatrix}-2&2\\-1&1\end{bmatrix}\rightarrow\begin{bmatrix}1&-1\\0&0\end{bmatrix}$, so $x_1=x_2$ and $\vec{v}_2=\begin{bmatrix}1\\1\end{bmatrix}$. Thus $P=\begin{bmatrix}2&1\\1&1\end{bmatrix}$, $D=\begin{bmatrix}2&0\\0&3\end{bmatrix}$, $P^{-1}=\frac{1}{2-1}\begin{bmatrix}1&-1\\-1&2\end{bmatrix}=\begin{bmatrix}1&-1\\-1&2\end{bmatrix}$, and $D^{1000}=\begin{bmatrix}2^{1000}&0\\0&3^{1000}\end{bmatrix}$. Then
$$A^{1000}=PD^{1000}P^{-1}=\begin{bmatrix}2&1\\1&1\end{bmatrix}\begin{bmatrix}2^{1000}&0\\0&3^{1000}\end{bmatrix}\begin{bmatrix}1&-1\\-1&2\end{bmatrix}=\begin{bmatrix}2^{1001}&3^{1000}\\2^{1000}&3^{1000}\end{bmatrix}\begin{bmatrix}1&-1\\-1&2\end{bmatrix}=\begin{bmatrix}2^{1001}-3^{1000}&-2^{1001}+2\cdot 3^{1000}\\ 2^{1000}-3^{1000}&-2^{1000}+2\cdot 3^{1000}\end{bmatrix}.$$
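The closed form can be spot-checked with exact integer arithmetic in plain Python (no libraries; the helper `matmul2` is just for this check):

```python
def matmul2(X, Y):
    """Multiply two 2x2 matrices given as nested lists (exact integers)."""
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

A = [[1, 2], [-1, 4]]
Ak = [[1, 0], [0, 1]]          # identity; build A^1000 by repeated multiplication
for _ in range(1000):
    Ak = matmul2(Ak, A)

expected = [[2**1001 - 3**1000, -2**1001 + 2 * 3**1000],
            [2**1000 - 3**1000, -2**1000 + 2 * 3**1000]]
print(Ak == expected)          # True
```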

6.5 Complex Diagonalization

Consider the standard matrix of the rotation mapping $R_{\theta}:\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}$: $$[R_{\theta}]=\begin{bmatrix}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{bmatrix}\quad(0\le\theta<2\pi).$$ (a) Thinking geometrically, should this matrix have any real eigenvalues? Generally no: a rotation changes the direction of every vector unless the rotation is by $0$ or $\pi$ radians. (b) Can you confirm your answer to (a) by studying the roots of $C(\lambda)$? $C(\lambda)=\begin{vmatrix}\cos\theta-\lambda&-\sin\theta\\ \sin\theta&\cos\theta-\lambda\end{vmatrix}=(\cos\theta-\lambda)^2-(-\sin\theta)(\sin\theta)=\cos^2\theta-2\lambda\cos\theta+\lambda^2+\sin^2\theta=\lambda^2-2\lambda\cos\theta+1$. The roots are $$\lambda=\frac{2\cos\theta\pm\sqrt{4\cos^2\theta-4}}{2}=\frac{2\cos\theta\pm\sqrt{4(\cos^2\theta-1)}}{2}=\frac{2\cos\theta\pm\sqrt{-4\sin^2\theta}}{2}=\cos\theta\pm i\sin\theta,$$ which are real only when $\sin\theta=0$, i.e. $\theta=0$ or $\theta=\pi$. If $\theta=0$: $R_0=\begin{bmatrix}1&0\\0&1\end{bmatrix}$, with eigenvalue $\lambda=1$ and $E_{\lambda}=\text{Span}\{\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\}$. If $\theta=\pi$: $R_{\pi}=\begin{bmatrix}-1&0\\0&-1\end{bmatrix}$, with eigenvalue $\lambda=-1$ and $E_{\lambda}=\text{Span}\{\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\}$ (every non-zero vector is an eigenvector).
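The non-real eigenvalues can be seen numerically (a sketch assuming NumPy; $\theta=\pi/3$ is an arbitrary angle with $\sin\theta\ne0$):

```python
import numpy as np

theta = np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
evals = np.linalg.eigvals(R)
print(evals)    # cos(theta) +/- i*sin(theta): not real, as predicted
```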

e.g. What if $\theta=\pi/2$? Then $\lambda=\cos(\pi/2)\pm i\sin(\pi/2)=0\pm i(1)=\pm i$. For $\lambda=i$, with $A=[R_{\pi/2}]=\begin{bmatrix}0&-1\\1&0\end{bmatrix}$: $(A-iI)\vec{v}=\begin{bmatrix}-i&-1\\1&-i\end{bmatrix}\vec{v}=\vec{0}$. Row reducing, $\begin{bmatrix}-i&-1\\1&-i\end{bmatrix}\xrightarrow{R_1\leftrightarrow R_2}\begin{bmatrix}1&-i\\-i&-1\end{bmatrix}\xrightarrow{R_2+iR_1}\begin{bmatrix}1&-i\\0&0\end{bmatrix}$. So $x_1-ix_2=0\Rightarrow x_1=ix_2$. Let $x_2=t$; then $\vec{v}=t\begin{bmatrix}i\\1\end{bmatrix}$.

The set $M_{2\times2}(\mathbb{C})$ is defined by $M_{2\times2}(\mathbb{C})=\left\{A=\begin{bmatrix}a_{11}&a_{12}\\ a_{21}&a_{22}\end{bmatrix}\;\middle|\;a_{11},a_{12},a_{21},a_{22}\in\mathbb{C}\right\}$. An element $A$ of $M_{2\times2}(\mathbb{C})$ is called a (complex) matrix. In $M_{2\times2}(\mathbb{C})$, matrix addition and scalar multiplication follow as expected. Given $A=\begin{bmatrix}a_{11}&a_{12}\\ a_{21}&a_{22}\end{bmatrix}$, $B=\begin{bmatrix}b_{11}&b_{12}\\ b_{21}&b_{22}\end{bmatrix}$ and $\alpha\in\mathbb{C}$:

  1. vector addition: $A+B=\begin{bmatrix}a_{11}+b_{11}&a_{12}+b_{12}\\ a_{21}+b_{21}&a_{22}+b_{22}\end{bmatrix}\in M_{2\times2}(\mathbb{C})$
  2. scalar multiplication: $\alpha A=\begin{bmatrix}\alpha a_{11}&\alpha a_{12}\\ \alpha a_{21}&\alpha a_{22}\end{bmatrix}\in M_{2\times2}(\mathbb{C})$

Matrix-vector and matrix-matrix products for complex matrices and vectors are also defined just as they were for $\mathbb{R}^{2}$ and $M_{2\times2}(\mathbb{R})$.
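These entrywise operations can be sketched in NumPy, which supports complex matrices directly; the particular entries below are made up for illustration:

```python
import numpy as np

# Two complex 2x2 matrices and a complex scalar (entries chosen arbitrarily)
A = np.array([[1 + 2j, -1],
              [0,      3j]])
B = np.array([[2,      1j],
              [1 - 1j, 4]])
alpha = 2 - 1j

print(A + B)      # entrywise sum, still a 2x2 complex matrix
print(alpha * A)  # each entry scaled by alpha
print(A @ B)      # matrix-matrix product, same rule as over R
```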

Theorem 1.6.3 (restated here for $\mathbb{C}^{n}$)

If $\vec{x}, \vec{y}, \vec{w}\in\mathbb{C}^{n}$ and $c, d\in\mathbb{C}$, then

  1. $\vec{x}+\vec{y}\in\mathbb{C}^{n}$.
  2. $(\vec{x}+\vec{y})+\vec{w}=\vec{x}+(\vec{y}+\vec{w})$.
  3. $\vec{x}+\vec{y}=\vec{y}+\vec{x}$.
  4. There exists a vector $\vec{0}\in\mathbb{C}^{n}$ such that $\vec{x}+\vec{0}=\vec{x}$ for all $\vec{x}\in\mathbb{C}^{n}$.
  5. For each $\vec{x}\in\mathbb{C}^{n}$ there exists $(-\vec{x})\in\mathbb{C}^{n}$ such that $\vec{x}+(-\vec{x})=\vec{0}$.
  6. $c\vec{x}\in\mathbb{C}^{n}$.
  7. $c(d\vec{x})=(cd)\vec{x}$.
  8. $(c+d)\vec{x}=c\vec{x}+d\vec{x}$.
  9. $c(\vec{x}+\vec{y})=c\vec{x}+c\vec{y}$.
  10. $1\vec{x}=\vec{x}$.
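A few of these axioms can be spot-checked numerically on random vectors in $\mathbb{C}^{3}$ (an illustrative check, of course not a proof):

```python
import numpy as np

# Random complex vectors and scalars (seeded for reproducibility)
rng = np.random.default_rng(0)
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
c, d = 2 - 1j, -1 + 3j

print(np.allclose(x + y, y + x))                 # axiom 3: commutativity
print(np.allclose((c + d) * x, c * x + d * x))   # axiom 8: distributivity over scalars
print(np.allclose(c * (x + y), c * x + c * y))   # axiom 9: distributivity over vectors
```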

e.g. The matrix $A=\begin{bmatrix}1&-1\\ 1&1\end{bmatrix}$ has no real eigenvalues. We will view it as a matrix in $M_{2\times2}(\mathbb{C})$ and search for complex eigenvalues and eigenvectors just as we did over $\mathbb{R}$.
(a) Show that the two complex roots of $C(\lambda)$ are $\lambda_{1}=1+i$ and $\lambda_{2}=1-i$.
$$C(\lambda) = \det(A-\lambda I) = \begin{vmatrix}1-\lambda&-1\\1&1-\lambda\end{vmatrix} = (1-\lambda)^2 - (-1)(1) = (1-\lambda)^2+1.$$
Setting $(1-\lambda)^2+1=0$ gives $(1-\lambda)^2=-1$, so $1-\lambda=\pm i$ and $\lambda = 1\mp i$. Thus $\lambda_1=1+i$ and $\lambda_2=1-i$.
(b) Determine all $\vec{z}\in\mathbb{C}^{2}$ such that $A\vec{z}=(1+i)\vec{z}$, i.e. solve $(A-(1+i)I)\vec{z}=\vec{0}$.
$$A-(1+i)I = \begin{bmatrix}1-(1+i)&-1\\1&1-(1+i)\end{bmatrix} = \begin{bmatrix}-i&-1\\1&-i\end{bmatrix}.$$
$$\begin{bmatrix}-i&-1\\1&-i\end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix}1&-i\\-i&-1\end{bmatrix} \xrightarrow{R_2+iR_1} \begin{bmatrix}1&-i\\0&0\end{bmatrix}.$$
So $z_1-iz_2=0$, i.e. $z_1=iz_2$. Letting $z_2=t$, we get $\vec{z}=t\begin{bmatrix}i\\1\end{bmatrix}$.
(c) Determine all $\vec{w}\in\mathbb{C}^{2}$ such that $A\vec{w}=(1-i)\vec{w}$, i.e. solve $(A-(1-i)I)\vec{w}=\vec{0}$.
$$A-(1-i)I = \begin{bmatrix}1-(1-i)&-1\\1&1-(1-i)\end{bmatrix} = \begin{bmatrix}i&-1\\1&i\end{bmatrix}.$$
$$\begin{bmatrix}i&-1\\1&i\end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix}1&i\\i&-1\end{bmatrix} \xrightarrow{R_2-iR_1} \begin{bmatrix}1&i\\0&0\end{bmatrix}.$$
So $w_1+iw_2=0$, i.e. $w_1=-iw_2$. Letting $w_2=t$, we get $\vec{w}=t\begin{bmatrix}-i\\1\end{bmatrix}$.
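The eigenpairs found by hand can be verified numerically; this NumPy sketch (illustrative, not from the notes) checks $A\vec{z}=(1+i)\vec{z}$ and $A\vec{w}=(1-i)\vec{w}$ for the $t=1$ eigenvectors:

```python
import numpy as np

A = np.array([[1, -1],
              [1,  1]])

# Eigenvectors found by hand, taking t = 1
z = np.array([1j, 1])    # for eigenvalue 1 + i
w = np.array([-1j, 1])   # for eigenvalue 1 - i

print(np.allclose(A @ z, (1 + 1j) * z))  # True
print(np.allclose(A @ w, (1 - 1j) * w))  # True
```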
(d) Construct a matrix $P=[\vec{z}\;\vec{w}]$ such that $\vec{z}$ is a nonzero solution from (b) and $\vec{w}$ is a nonzero solution from (c). Taking $t=1$ in both gives $P=\begin{bmatrix}i&-i\\1&1\end{bmatrix}$.
(e) Determine a matrix $P^{-1}$ such that $PP^{-1}=I=P^{-1}P$. Here $\det P = i(1)-(-i)(1)=i+i=2i$, so
$$P^{-1} = \frac{1}{2i}\begin{bmatrix}1&i\\-1&i\end{bmatrix} = \frac{-i}{2}\begin{bmatrix}1&i\\-1&i\end{bmatrix} = \begin{bmatrix}-i/2 & -i^2/2 \\ i/2 & -i^2/2\end{bmatrix} = \begin{bmatrix}-i/2 & 1/2 \\ i/2 & 1/2\end{bmatrix}.$$
(f) Calculate the matrix $P^{-1}AP$. Since the columns of $P$ are eigenvectors for $1+i$ and $1-i$ respectively, this gives $D=\begin{bmatrix}1+i&0\\0&1-i\end{bmatrix}$.
(g) Can you use your work in (a)-(f) to calculate $A^{100}$? Yes: $A=PDP^{-1}$, so $A^{100}=PD^{100}P^{-1}$ with $D^{100} = \begin{bmatrix}(1+i)^{100}&0\\0&(1-i)^{100}\end{bmatrix}$.
In polar form, $1+i = \sqrt{2}(\cos(\pi/4)+i\sin(\pi/4)) = \sqrt{2}e^{i\pi/4}$, so $(1+i)^{100} = (\sqrt{2})^{100} e^{i100\pi/4} = 2^{50} e^{i25\pi} = 2^{50}(\cos(25\pi)+i\sin(25\pi)) = -2^{50}$.
Similarly, $1-i = \sqrt{2}(\cos(-\pi/4)+i\sin(-\pi/4)) = \sqrt{2}e^{-i\pi/4}$, so $(1-i)^{100} = 2^{50} e^{-i25\pi} = -2^{50}$.
Hence $D^{100} = \begin{bmatrix}-2^{50}&0\\0&-2^{50}\end{bmatrix} = -2^{50}I$ and $A^{100} = P(-2^{50}I)P^{-1} = -2^{50}PIP^{-1} = -2^{50}I = \begin{bmatrix}-2^{50}&0\\0&-2^{50}\end{bmatrix}$.
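The conclusion $A^{100}=-2^{50}I$ can be double-checked with exact integer arithmetic, e.g. via `numpy.linalg.matrix_power` (an illustrative check; since $A^{4}=-4I$, the entries of every intermediate power stay far below the `int64` overflow limit):

```python
import numpy as np

A = np.array([[1, -1],
              [1,  1]], dtype=np.int64)

# Exact integer computation: A^4 = -4I, so A^100 = (-4)^25 I = -2^50 I,
# and all intermediate entries fit comfortably in int64.
A100 = np.linalg.matrix_power(A, 100)
print(np.array_equal(A100, -(2**50) * np.eye(2, dtype=np.int64)))  # True
```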